Quality Analysis of Macroprosodic F0 Dynamics in Text-to-Speech Signals
Authors
Abstract
We present a study on the relation between fundamental frequency (F0) and its perceptual effect in the context of text-to-speech (TTS) synthesis. Features that essentially capture the intonational (macro-prosodic) properties of speech are introduced and analysed with regard to the following questions: (i) How does the prosodic variation of TTS signals differ from that of natural speech? (ii) Is there a functional relationship between the prosodic variation of TTS signals and their perceived quality? In answering these questions we present novel approaches for the construction of non-intrusive quality estimators. The results reveal a substantial degree of systematic influence of prosodic variation on TTS quality.
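As a rough illustration of what "prosodic variation" features can look like, the sketch below summarises an F0 contour by a few global statistics on a semitone scale. It is not the feature set of the paper, which the abstract does not specify. The function name `f0_variation_features`, the 100 Hz reference, and the convention that unvoiced frames are marked with 0 or `None` are all assumptions made for this example.

```python
import math
import statistics


def f0_variation_features(f0_hz, ref_hz=100.0):
    """Summarise the global variation of a frame-wise F0 contour.

    f0_hz  : sequence of F0 values in Hz, one per frame; 0 or None marks
             an unvoiced frame (assumed convention, not from the paper).
    ref_hz : reference frequency for the semitone scale (assumed value).
    """
    # Keep voiced frames only; unvoiced frames carry no F0 information.
    voiced = [f for f in f0_hz if f and f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")

    # Convert Hz to semitones relative to ref_hz, so that equal pitch
    # intervals correspond to equal numeric distances.
    semitones = [12.0 * math.log2(f / ref_hz) for f in voiced]

    return {
        "mean_st": statistics.fmean(semitones),          # average pitch level
        "std_st": statistics.pstdev(semitones),          # overall pitch variability
        "range_st": max(semitones) - min(semitones),     # pitch span
        "voiced_ratio": len(voiced) / len(f0_hz),        # share of voiced frames
    }


if __name__ == "__main__":
    contour = [100.0, 200.0, 0.0, 100.0, 200.0]  # toy contour, one unvoiced frame
    print(f0_variation_features(contour))
```

Statistics of this kind could be computed for both TTS and natural utterances and compared, or used as inputs to a regression against listener ratings; which features and which model are appropriate is left open here.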
Related articles
High-quality prosodic modification of speech signals
The aim of this work was to develop a procedure that allows prosodic modifications of speech signals without impairing the quality. The developed procedure is based on the Fourier analysis/synthesis technique with several improvements on the analysis side, such as the analysis of signals with rapidly changing F0 and the analysis of weak spectral components. Also for the modification of the short-...
Stylisation and Symbolic Coding of F0: a Quantitative Model
This paper presents a reversible model for the stylisation and the symbolic coding of macroprosodic fundamental frequency patterns. Prosodic labels are generated automatically from the speech signal and can be used to regenerate a synthetic F0 curve which is as close as possible to the original curve. The model has been tested successfully for 20 speakers in French and Italian.
Word segmentation in Persian continuous speech using F0 contour
Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...
A New Approach to Detect Congestive Heart Failure Using Symbolic Dynamics Analysis of Electrocardiogram Signal
The aim of this study is to show that the measures derived from Electrocardiogram (ECG) signals many a time perform better than the same measures obtained from heart rate (HR) signals. A comparison was made to investigate how far the nonlinear symbolic dynamics approach helps to characterize the nonlinear properties of ECG signals and HR signals, and thereby discriminate between normal and cong...
Using Noisy Speech to Study the Robustness of a Continuous F0 Modelling Method in HMM-based Speech Synthesis
In parametric text-to-speech synthesis using Hidden Markov Model (HMM), the fundamental frequency (F0) parameter modelling is important because it has a direct effect on the prosody of synthetic speech. F0 is typically modelled by a discrete distribution for unvoiced speech and a continuous distribution for voiced, by using a multi-space distribution (MSD). However, F0 modelling using MSD-HMM i...